Can AI improve itself?

A blog post about getting LLMs to rate and refine themselves
LLMs
prompt
Author

Emoo

Published

February 20, 2024

[Image: DALL·E illustration of an artificial intelligence depicted as a futuristic humanoid robot with a digital face]

Can we make an LLM prompt itself and then refine that prompt using its own criteria, so that it improves itself?

Creating a good GPT prompt involves clarity, context, specificity, and structure. Here are some guidelines to help you craft effective prompts:

1. Be Clear and Concise

  • Specific Request: Clearly state what you want. Avoid ambiguous language.
    • Example: Instead of “Tell me about AI,” say “Explain the history and development of artificial intelligence from the 1950s to the present.”

2. Provide Context

  • Background Information: Give enough context so the model understands the scope of the request.
    • Example: “For a presentation to high school students, explain the basic concepts of machine learning in simple terms.”

3. Ask Direct Questions

  • Focused Queries: Ask specific questions to get precise answers.
    • Example: “What are the main differences between supervised and unsupervised learning?”

4. Set the Tone and Style

  • Desired Style: Specify the tone, style, or format you want.
    • Example: “Write a humorous story about a day in the life of a robot.”

5. Use Examples

  • Illustrate Requests: Provide examples to clarify your needs.
    • Example: “Generate a professional email response to a job offer, similar to this: [include example].”

6. Limit the Scope

  • Specificity: Narrow down broad topics to specific aspects.
    • Example: Instead of “Discuss climate change,” say “Explain the impact of climate change on polar bear populations.”

7. Use Proper Formatting

  • Organized Prompts: Break down complex requests into bullet points or numbered lists.
    • Example:
      1. “Describe the main features of Python programming language.”
      2. “Give examples of its applications in data science.”

Examples of Good Prompts

  1. Descriptive Prompt
    • “Describe the economic impact of the COVID-19 pandemic on small businesses in the United States.”
  2. Creative Prompt
    • “Write a short sci-fi story about humans colonizing Mars and encountering alien life.”
  3. Technical Prompt
    • “Explain the difference between deterministic and stochastic models in operations research.”
  4. Instructive Prompt
    • “Provide a step-by-step guide on how to set up a WordPress blog for beginners.”
  5. Comparative Prompt
    • “Compare and contrast the leadership styles of Steve Jobs and Elon Musk.”

By following these guidelines, you can create effective GPT prompts that yield informative and relevant responses.
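The guidelines above can be applied mechanically with a small helper. This is a minimal sketch and not part of the original conversation: `build_prompt`, its parameter names, and the labelled layout (Context, Task, Tone, Format) are hypothetical choices introduced here for illustration.

```python
def build_prompt(task, context="", tone="", aspects=(), output_format=""):
    """Assemble a prompt from the pieces the guidelines ask for.

    All parameter names are illustrative assumptions, not a standard API.
    """
    lines = []
    if context:                      # guideline 2: provide context
        lines.append(f"Context: {context}")
    lines.append(f"Task: {task}")    # guidelines 1 and 3: clear, direct request
    if aspects:                      # guidelines 6 and 7: limited scope, as a list
        lines.append("Cover these points:")
        lines.extend(f"{i}. {a}" for i, a in enumerate(aspects, start=1))
    if tone:                         # guideline 4: tone and style
        lines.append(f"Tone: {tone}")
    if output_format:
        lines.append(f"Format: {output_format}")
    return "\n".join(lines)


prompt = build_prompt(
    task="Explain the basic concepts of machine learning in simple terms.",
    context="A presentation to high school students.",
    tone="Friendly and simple",
    aspects=["What machine learning is", "Supervised vs. unsupervised learning"],
)
print(prompt)
```

Optional fields are simply omitted when empty, so the same helper covers both short, direct questions and longer structured requests.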

Then I asked for a set of criteria, each scored from 0 to 10, to evaluate how good a prompt is.

Evaluating the quality of a prompt can be done using a set of criteria, each scored on a scale from 0 to 10. Here are the criteria with descriptions:

1. Clarity

  • Description: Is the prompt clear and easy to understand?
  • Score 0-10: 0 (very confusing) to 10 (perfectly clear)

2. Context

  • Description: Does the prompt provide enough background information for the model to understand the scope?
  • Score 0-10: 0 (no context) to 10 (excellent context)

3. Specificity

  • Description: Is the prompt specific enough to get a precise response?
  • Score 0-10: 0 (very vague) to 10 (highly specific)

4. Directness

  • Description: Does the prompt ask direct questions or make direct requests?
  • Score 0-10: 0 (very indirect) to 10 (very direct)

5. Tone and Style

  • Description: Does the prompt specify the desired tone, style, or format?
  • Score 0-10: 0 (no tone/style specified) to 10 (tone/style clearly specified)

6. Examples

  • Description: Does the prompt provide examples to illustrate the request?
  • Score 0-10: 0 (no examples) to 10 (excellent examples provided)

7. Scope Limitation

  • Description: Does the prompt narrow down broad topics to specific aspects?
  • Score 0-10: 0 (very broad) to 10 (well-defined scope)

8. Formatting

  • Description: Is the prompt well-organized, using bullet points or lists where necessary?
  • Score 0-10: 0 (poorly formatted) to 10 (well-organized)

9. Relevance

  • Description: Is the prompt relevant to the intended use or audience?
  • Score 0-10: 0 (not relevant) to 10 (highly relevant)

10. Creativity (if applicable)

  • Description: Does the prompt encourage creative or unique responses (for creative tasks)?
  • Score 0-10: 0 (not creative) to 10 (very creative)
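The ten criteria above form a simple rubric with a maximum of 100 points. As a rough sketch, the rubric could be written down as data with a basic range check. The names `CRITERIA`, `MAX_PER_CRITERION`, and `validate_scores` are assumptions made here for illustration.

```python
# The ten criteria from the rubric, in the order they are listed.
CRITERIA = [
    "Clarity", "Context", "Specificity", "Directness", "Tone and Style",
    "Examples", "Scope Limitation", "Formatting", "Relevance", "Creativity",
]
MAX_PER_CRITERION = 10
MAX_TOTAL = len(CRITERIA) * MAX_PER_CRITERION  # 10 criteria x 10 points


def validate_scores(scores):
    """Raise ValueError unless every criterion has a score from 0 to 10."""
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for name, value in scores.items():
        if name not in CRITERIA:
            raise ValueError(f"unknown criterion: {name}")
        if not 0 <= value <= MAX_PER_CRITERION:
            raise ValueError(f"{name} score {value} is outside 0-10")
    return True


print(MAX_TOTAL)
print(validate_scores({name: 5 for name in CRITERIA}))
```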

Evaluation Example

Prompt Example: “Write a humorous story about a robot who learns to cook and ends up becoming a celebrity chef.”

Scoring

  1. Clarity: 10 (very clear)
  2. Context: 8 (some context provided)
  3. Specificity: 9 (specific task)
  4. Directness: 10 (direct request)
  5. Tone and Style: 10 (humorous tone specified)
  6. Examples: 0 (no examples given)
  7. Scope Limitation: 9 (well-defined scope)
  8. Formatting: 8 (clear but could be listed)
  9. Relevance: 10 (relevant for creative writing)
  10. Creativity: 10 (very creative task)

Total Score: 84/100

This prompt scores highly in most categories, making it an effective and high-quality prompt for generating a humorous story about a robot learning to cook.
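The total above is just the sum of the ten individual scores. A short sketch of that arithmetic, using the scores from the example (the variable names are illustrative):

```python
# Scores copied from the evaluation example above.
scores = {
    "Clarity": 10, "Context": 8, "Specificity": 9, "Directness": 10,
    "Tone and Style": 10, "Examples": 0, "Scope Limitation": 9,
    "Formatting": 8, "Relevance": 10, "Creativity": 10,
}

total = sum(scores.values())
weakest = min(scores, key=scores.get)  # criterion with the lowest score

print(f"Total Score: {total}/100")     # Total Score: 84/100
print(f"Weakest criterion: {weakest}") # Weakest criterion: Examples
```

Looking at the weakest criterion rather than only the total shows where a prompt has the most room to improve.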

Please give me three example prompts to generate a post about LLMs

Here are three example prompts to generate a post about Large Language Models (LLMs):

Prompt 1: Informative Overview

“Write a detailed post explaining what Large Language Models (LLMs) are, including their history, development, key features, and applications in various industries. Make sure to highlight the significance of LLMs in modern AI research and provide examples of popular LLMs such as GPT-4, BERT, and T5.”

Prompt 2: Technical Deep Dive

“Create a technical blog post that delves into the architecture and functioning of Large Language Models (LLMs). Discuss how LLMs are trained, the underlying neural network structures, and the challenges faced during their development. Include discussions on concepts like transformers, attention mechanisms, and fine-tuning, using examples from well-known models like GPT-4 and BERT.”

Prompt 3: Impact and Ethical Considerations

“Write a comprehensive post discussing the impact of Large Language Models (LLMs) on society and the ethical considerations associated with their use. Cover topics such as data privacy, potential biases in LLMs, and the implications for various sectors like healthcare, education, and entertainment. Include real-world examples and potential solutions to address ethical challenges.”

These prompts provide a clear focus, context, and specific aspects to cover, ensuring that the generated posts are informative, relevant, and engaging.

Evaluate them using the given criteria

Let’s evaluate each prompt using the criteria provided, scoring each criterion from 0 to 10.

Prompt 1: Informative Overview

“Write a detailed post explaining what Large Language Models (LLMs) are, including their history, development, key features, and applications in various industries. Make sure to highlight the significance of LLMs in modern AI research and provide examples of popular LLMs such as GPT-4, BERT, and T5.”

Scoring

  1. Clarity: 10 (very clear)
  2. Context: 9 (provides good background information)
  3. Specificity: 9 (specific aspects like history, development, features, applications)
  4. Directness: 10 (direct request)
  5. Tone and Style: 8 (detailed but tone/style not explicitly mentioned)
  6. Examples: 10 (provides specific examples like GPT-4, BERT, T5)
  7. Scope Limitation: 9 (well-defined scope covering multiple aspects)
  8. Formatting: 8 (clear but not in list format)
  9. Relevance: 10 (relevant to understanding LLMs)
  10. Creativity: 7 (informative but not highly creative)

Total Score: 90/100

Prompt 2: Technical Deep Dive

“Create a technical blog post that delves into the architecture and functioning of Large Language Models (LLMs). Discuss how LLMs are trained, the underlying neural network structures, and the challenges faced during their development. Include discussions on concepts like transformers, attention mechanisms, and fine-tuning, using examples from well-known models like GPT-4 and BERT.”

Scoring

  1. Clarity: 10 (very clear)
  2. Context: 9 (provides good technical context)
  3. Specificity: 10 (specific technical aspects covered)
  4. Directness: 10 (direct request)
  5. Tone and Style: 9 (technical but tone/style not explicitly mentioned)
  6. Examples: 10 (provides specific examples like GPT-4, BERT)
  7. Scope Limitation: 10 (well-defined technical scope)
  8. Formatting: 8 (clear but not in list format)
  9. Relevance: 10 (relevant to technical understanding of LLMs)
  10. Creativity: 7 (technical but not highly creative)

Total Score: 93/100

Prompt 3: Impact and Ethical Considerations

“Write a comprehensive post discussing the impact of Large Language Models (LLMs) on society and the ethical considerations associated with their use. Cover topics such as data privacy, potential biases in LLMs, and the implications for various sectors like healthcare, education, and entertainment. Include real-world examples and potential solutions to address ethical challenges.”

Scoring

  1. Clarity: 10 (very clear)
  2. Context: 10 (provides excellent context)
  3. Specificity: 10 (specific ethical aspects covered)
  4. Directness: 10 (direct request)
  5. Tone and Style: 8 (comprehensive but tone/style not explicitly mentioned)
  6. Examples: 9 (mentions real-world examples but not specific ones)
  7. Scope Limitation: 10 (well-defined scope covering ethical implications)
  8. Formatting: 8 (clear but not in list format)
  9. Relevance: 10 (relevant to societal impact and ethics)
  10. Creativity: 8 (covers ethical challenges creatively)

Total Score: 93/100

Summary

  • Prompt 1: 90/100
  • Prompt 2: 93/100
  • Prompt 3: 93/100

All three prompts are strong, with Prompts 2 and 3 scoring slightly higher due to their specificity and relevance to technical and ethical aspects, respectively.
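Since the totals are produced by the model itself, it can be worth recomputing them from the per-criterion lines rather than trusting the reported figure. The following sketch parses the scoring text for Prompt 1 and checks its sum. The regular expressions and the names `parse_scores` and `reported_total` are assumptions introduced here, and they only handle the exact line format shown in this post.

```python
import re

# Matches lines such as "  3. Specificity: 9 (specific aspects ...)".
SCORE_LINE = re.compile(r"^\s*\d+\.\s*(?P<name>[^:]+):\s*(?P<score>\d+)\b")
TOTAL_LINE = re.compile(r"Total Score:\s*(\d+)/100")


def parse_scores(text):
    """Return {criterion: score} for every numbered scoring line in text."""
    scores = {}
    for line in text.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            scores[match.group("name").strip()] = int(match.group("score"))
    return scores


def reported_total(text):
    """Return the total the reply claims, or None if no total line is found."""
    match = TOTAL_LINE.search(text)
    return int(match.group(1)) if match else None


reply = """\
  1. Clarity: 10 (very clear)
  2. Context: 9 (provides good background information)
  3. Specificity: 9 (specific aspects like history, development, features, applications)
  4. Directness: 10 (direct request)
  5. Tone and Style: 8 (detailed but tone/style not explicitly mentioned)
  6. Examples: 10 (provides specific examples like GPT-4, BERT, T5)
  7. Scope Limitation: 9 (well-defined scope covering multiple aspects)
  8. Formatting: 8 (clear but not in list format)
  9. Relevance: 10 (relevant to understanding LLMs)
  10. Creativity: 7 (informative but not highly creative)

Total Score: 90/100
"""

scores = parse_scores(reply)
computed = sum(scores.values())
reported = reported_total(reply)
print(computed, reported, computed == reported)
```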

Refine each prompt to score better

Here are the refined versions of each prompt to improve their scores:

Refined Prompt 1: Informative Overview

“Write a detailed and engaging post explaining what Large Language Models (LLMs) are. Include their history, development, key features, and applications in various industries. Highlight the significance of LLMs in modern AI research and provide examples of popular LLMs such as GPT-4, BERT, and T5. Use a professional and informative tone, and include bullet points or subheadings for better readability.”

Scoring

  1. Clarity: 10 (very clear)
  2. Context: 9 (good background information)
  3. Specificity: 9 (specific aspects like history, development, features, applications)
  4. Directness: 10 (direct request)
  5. Tone and Style: 10 (professional and informative tone specified)
  6. Examples: 10 (provides specific examples like GPT-4, BERT, T5)
  7. Scope Limitation: 9 (well-defined scope covering multiple aspects)
  8. Formatting: 10 (suggests bullet points or subheadings)
  9. Relevance: 10 (relevant to understanding LLMs)
  10. Creativity: 8 (informative and engaging)

Total Score: 95/100
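To see where the five extra points come from, the two score lists for Prompt 1 can be compared criterion by criterion. This is a small illustrative sketch using the numbers reported above; the variable names are hypothetical.

```python
CRITERIA = [
    "Clarity", "Context", "Specificity", "Directness", "Tone and Style",
    "Examples", "Scope Limitation", "Formatting", "Relevance", "Creativity",
]
# Scores for Prompt 1 before and after refinement, in rubric order.
original = [10, 9, 9, 10, 8, 10, 9, 8, 10, 7]
refined = [10, 9, 9, 10, 10, 10, 9, 10, 10, 8]

# Keep only the criteria whose score changed.
changes = {
    name: (before, after)
    for name, before, after in zip(CRITERIA, original, refined)
    if before != after
}

for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after} ({after - before:+d})")
print(f"Total: {sum(original)} -> {sum(refined)}")
```

Only Tone and Style, Formatting, and Creativity move, which matches the sentences that were added to the refined prompt.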

Refined Prompt 2: Technical Deep Dive

“Create a detailed technical blog post that delves into the architecture and functioning of Large Language Models (LLMs). Discuss how LLMs are trained, the underlying neural network structures, and the challenges faced during their development. Include detailed discussions on concepts like transformers, attention mechanisms, and fine-tuning, using examples from well-known models like GPT-4 and BERT. Use a technical and precise tone, and organize the content with subheadings and bullet points.”

Scoring

  1. Clarity: 10 (very clear)
  2. Context: 9 (good technical context)
  3. Specificity: 10 (specific technical aspects covered)
  4. Directness: 10 (direct request)
  5. Tone and Style: 10 (technical and precise tone specified)
  6. Examples: 10 (provides specific examples like GPT-4, BERT)
  7. Scope Limitation: 10 (well-defined technical scope)
  8. Formatting: 10 (suggests subheadings and bullet points)
  9. Relevance: 10 (relevant to technical understanding of LLMs)
  10. Creativity: 8 (technical but well-structured)

Total Score: 97/100

Refined Prompt 3: Impact and Ethical Considerations

“Write a comprehensive and thought-provoking post discussing the impact of Large Language Models (LLMs) on society and the ethical considerations associated with their use. Cover topics such as data privacy, potential biases in LLMs, and the implications for various sectors like healthcare, education, and entertainment. Include real-world examples, specific cases, and potential solutions to address ethical challenges. Use a balanced and reflective tone, and organize the content with subheadings and bullet points for clarity.”

Scoring

  1. Clarity: 10 (very clear)
  2. Context: 10 (excellent context)
  3. Specificity: 10 (specific ethical aspects covered)
  4. Directness: 10 (direct request)
  5. Tone and Style: 10 (balanced and reflective tone specified)
  6. Examples: 10 (mentions real-world examples and specific cases)
  7. Scope Limitation: 10 (well-defined scope covering ethical implications)
  8. Formatting: 10 (suggests subheadings and bullet points)
  9. Relevance: 10 (relevant to societal impact and ethics)
  10. Creativity: 8 (covers ethical challenges creatively)

Total Score: 98/100

Summary of Refined Prompts

  • Refined Prompt 1: 95/100
  • Refined Prompt 2: 97/100
  • Refined Prompt 3: 98/100

These refinements aim to improve the prompts’ scores by specifying tone and style, encouraging better organization with subheadings and bullet points, and enhancing clarity and engagement.
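The whole exchange (score a prompt, refine it, score it again) can be written as a loop. Below is a minimal sketch under stated assumptions: `self_refine` is a hypothetical function, and `toy_score` and `toy_refine` are deterministic stand-ins for the two LLM calls so that the example runs without any API. In practice both would be replaced by calls to a model, and the scores would be the model's own judgments rather than keyword checks.

```python
def self_refine(prompt, score_fn, refine_fn, target, max_rounds=3):
    """Refine a prompt until it reaches the target total or stops improving.

    score_fn(prompt) -> {criterion: score}
    refine_fn(prompt, weakest_criteria) -> new prompt
    Returns the best prompt found and the list of totals per round.
    """
    best_prompt, best_scores = prompt, score_fn(prompt)
    history = [sum(best_scores.values())]
    for _ in range(max_rounds):
        if history[-1] >= target:
            break
        # Ask for a fix to the single lowest-scoring criterion.
        weakest = sorted(best_scores, key=best_scores.get)[:1]
        candidate = refine_fn(best_prompt, weakest)
        candidate_scores = score_fn(candidate)
        if sum(candidate_scores.values()) <= history[-1]:
            break  # the refinement did not help, so keep the previous prompt
        best_prompt, best_scores = candidate, candidate_scores
        history.append(sum(candidate_scores.values()))
    return best_prompt, history


# Toy stand-ins for the LLM (assumptions for illustration only).
HINTS = {
    "Tone and Style": "Use a professional and informative tone.",
    "Formatting": "Organize the content with subheadings and bullet points.",
}


def toy_score(prompt):
    text = prompt.lower()
    return {
        "Tone and Style": 10 if "tone" in text else 8,
        "Formatting": 10 if "subheadings" in text else 8,
    }


def toy_refine(prompt, weakest):
    return " ".join([prompt] + [HINTS[name] for name in weakest])


start = "Write a detailed post explaining what Large Language Models (LLMs) are."
final_prompt, history = self_refine(start, toy_score, toy_refine, target=20)
print(history)
print(final_prompt)
```

The loop stops either when the target is reached or when a refinement fails to raise the total, which guards against the model rewriting a prompt indefinitely without real gains.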